Abstract: A network-on-chip (NOC) is a new paradigm in complex system-on-chip (SOC) designs that provides an efficient on-chip communication
network. Data is routed through the network in the form of packets, and the routing is done mainly by routers, so the router architecture must
be efficient, with low latency and high throughput. In this project we designed, implemented and analyzed crossbar router architectures for
network-on-chip communication on an FPGA. The routers have five ports: four ports are connected to neighboring routers in four different directions,
and the fifth port is connected to the processing element through a network interface. Our proposed architecture contains a 4x4 crossbar switch,
switch allocator, path and channel request, data RAM and 4 I/O ports. The data is sent through the routers in order to prevent congestion. The
switch allocator and VC allocator are used to allocate the data in priority order. The switch allocator allocates the data according to the path
and channel request. The XY algorithm with a scheduler is used in this project to route the data to the proper destination.

Keywords: NOC, FPGA, switch allocator, VC allocator, ports.


I. INTRODUCTION 


Very large-scale integration (VLSI) is the process of integrating or embedding hundreds of thousands of transistors on a single silicon
semiconductor microchip. This field involves packing more and more logic devices into smaller and smaller areas. VHDL
(VHSIC Hardware Description Language) is a hardware description language used in electronic design automation to describe digital
and mixed-signal systems such as FPGAs and integrated circuits. VHDL can also be used as a general-purpose parallel programming
language.

The disadvantages of using VHDL are that modules must be declared by a prototype, the use of the keyword “downto” in every bit
vector definition is tedious, each process must have a sensitivity list that may sometimes be very long, and missing a single signal in
the sensitivity list can cause catastrophic differences between simulation and synthesis. Verilog, standardized as IEEE 1364, is a hardware
description language (HDL) used to model electronic systems. It is most commonly used in the design and verification of digital
circuits at the register-transfer level of abstraction. It is also used in the verification of analog circuits and mixed-signal circuits.


The advantages of coding in Verilog are that it supports verification through simulation, allows architectural trade-offs with short
turnaround, enables automatic synthesis, reduces the time for design capture, and makes designs easy to change.

Today’s SoCs need a network-on-chip IP interconnect fabric to reduce wire routing congestion, ease timing closure, reach higher
operating frequencies and allow IP blocks to be changed easily. Networks-on-chip are a critical technology that will enable the success
of future systems-on-chip for embedded applications, and the technology is expected to dominate computing platforms in the near
future. The paper is organized as follows: Section II gives an overview of existing work. Section III explains the
proposed method. Section IV discusses the results. Finally, Section V provides the conclusion.

II. Existing Overview

The input ports buffer incoming flits and send requests to the allocators. The routing computation module determines the output port
based on the routing algorithm. After the route computation, a free output VC (OVC) in the next router is assigned to the input VC
(IVC) by sending a request to the VC allocator. If an OVC is successfully assigned, another allocation request is sent to the
switch allocator. The crossbar is then configured to send the desired flit to the output port if the switch allocation request is granted.
In order to send requests to the switch allocator, the available space in the next router's buffer must be known. In the existing system,
the routers use routing algorithms such as the XY algorithm.

In "Design tradeoffs for hard and soft FPGA-based networks-on-chip", M. S. Abdelfattah and V. Betz present the design
of an NOC built around a router. In that design there is a chance of congestion since it does not have an allocator. We remove control
overheads (routing and arbitration logic) from the critical path in order to minimize cycle time and latency.

The paper "The Design of On-the-fly Virtual Channel Allocation for Low Cost High Performance On-Chip Router" proposes on-the-fly
virtual channel (VC) allocation for low-cost, high-performance on-chip routers. By performing VC allocation based on the result of
switch allocation, the dependency between VC allocation and switch traversal is removed and these stages can be performed in
parallel.


III. Proposed Method 

The proposed system uses a low-latency router microarchitecture with a VC allocator and a switch allocator. Any input flit that
passes through the switch can be successfully delivered at the output, because the path request is sent through the VC allocator. The switch
allocator and VC allocator are designed to operate in parallel. The scheduler is used with the XY algorithm in order to transfer the data
properly. To reduce communication latency while maintaining good throughput, a router needs to perform several stages, such as route
computation, VC allocation, and switch allocation, in parallel.


In the proposed NOC router architecture, shown in Figure 1, any request that has been granted service by the switch allocator is
able to pass a flit to the output port successfully. An efficient masking technique is proposed to filter out all switch allocation requests
that are not able to pass flits to the output port, either due to the lack of free space in the assigned VC or, for requests without an
assigned VC, due to the lack of a free VC in the output port. The proposed technique has minimal impact on the timing and area overhead
of an NOC router. It is also fully parameterizable in terms of the number of VCs, buffer width, and flit width.

Fig 1. Block Diagram

A crossbar switch (cross-point switch, matrix switch) is a collection of switches arranged in a matrix configuration. A crossbar switch 
has multiple input and output lines that form a crossed pattern of interconnecting lines between which a connection may be established 
by closing a switch located at each intersection, the elements of the matrix. 
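
The crosspoint matrix described above can be illustrated with a small behavioral model. This is only a sketch in Python, not the hardware description used in this work; the class name `Crossbar`, its methods, and the rule that an output may have only one driver are assumptions made for the illustration.

```python
class Crossbar:
    """Behavioral model of an n_in x n_out crossbar switch (illustrative only)."""

    def __init__(self, n_in, n_out):
        self.n_in = n_in
        self.n_out = n_out
        # closed[i][o] is True when the crosspoint joining input i
        # and output o is closed.
        self.closed = [[False] * n_out for _ in range(n_in)]

    def connect(self, i, o):
        """Close crosspoint (i, o); refuse if output o already has a driver."""
        if any(self.closed[k][o] for k in range(self.n_in)):
            raise ValueError(f"output {o} is already driven")
        self.closed[i][o] = True

    def release(self, i, o):
        """Open crosspoint (i, o)."""
        self.closed[i][o] = False

    def transfer(self, inputs):
        """Return the value seen on each output line (None if unconnected)."""
        outputs = [None] * self.n_out
        for i in range(self.n_in):
            for o in range(self.n_out):
                if self.closed[i][o]:
                    outputs[o] = inputs[i]
        return outputs


xbar = Crossbar(4, 4)
xbar.connect(0, 2)   # input 0 drives output 2
xbar.connect(3, 1)   # input 3 drives output 1
result = xbar.transfer(["a", "b", "c", "d"])
```

With these two connections, `result` is `[None, "d", "a", None]`: outputs 0 and 3 have no closed crosspoint and stay idle.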

A virtual channel router (VCR) is a router that uses wormhole flow control with virtual channels. This router architecture
has 5 input and output ports. Four of them are connected to neighboring routers and one is for the router’s local core. Each input port
has 4 virtual channels, which are de-multiplexed and buffered in FIFOs. After the FIFOs, the virtual channels are multiplexed again onto
a single channel that goes to a crossbar. Routing operations in the crossbar are controlled by an arbitration unit (AU). The arbitration
unit also ensures that there are no conflicts between virtual channels and that the arbitration is fair.


Each packet maintains state indicating the availability of buffer space at its assigned output VC. When flits are waiting to be sent and
buffer space is available, an input VC requests access to the necessary output channel via the router’s crossbar. On each cycle the
switch allocation logic matches these requests to output ports, generating the required crossbar control signals.
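
The masking of switch allocation requests described in this section can be sketched as a simple filter. The function below is a behavioral illustration under stated assumptions, not the circuit implemented in this work: the name `mask_requests`, the dictionary-based bookkeeping, and the use of a per-OVC credit count to represent free buffer space are all hypothetical.

```python
def mask_requests(requests, assigned_ovc, credits, free_ovcs):
    """Drop switch-allocation requests that could not deliver a flit.

    requests     : list of (ivc, out_port) pairs that have a flit waiting
    assigned_ovc : dict ivc -> (out_port, ovc) for IVCs that already hold an OVC
    credits      : dict (out_port, ovc) -> free buffer slots in the next router
    free_ovcs    : dict out_port -> number of unassigned OVCs at that port
    """
    passed = []
    for ivc, out_port in requests:
        if ivc in assigned_ovc:
            # Assigned request: needs free space in its assigned OVC.
            if credits.get(assigned_ovc[ivc], 0) > 0:
                passed.append((ivc, out_port))
        elif free_ovcs.get(out_port, 0) > 0:
            # Non-assigned request: needs a free OVC at the output port.
            passed.append((ivc, out_port))
    return passed


requests = [("A", 0), ("B", 0), ("C", 1), ("D", 2)]
assigned_ovc = {"A": (0, 1), "B": (0, 2)}
credits = {(0, 1): 3, (0, 2): 0}
free_ovcs = {1: 0, 2: 1}
masked = mask_requests(requests, assigned_ovc, credits, free_ovcs)
```

Here request B is filtered because its assigned OVC has no free space, and request C is filtered because output port 1 has no free OVC, so only A and D reach the switch allocator.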

After the IVC requests are masked, they are sent to the switch allocator. Because the switch allocator has two levels of arbitration,
arbiter delay is an important parameter in defining the NOC critical path. Hence, to minimize the arbitration delay, a fast
arbiter is proposed. The VC allocation stage assigns an empty VC in the neighboring router connected to the output port. Since several
header flits may send requests for the same VC, arbitration is required. Route computation and VC allocation require only the
header flit; the body and tail flits follow their respective header flit.
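
The header, body and tail behavior described above can be illustrated as follows. This is a minimal sketch assuming a simple flit record; the names `Flit`, `packetize` and `forward` and the field layout are hypothetical and are not taken from the implemented design.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple

HEAD, BODY, TAIL = "head", "body", "tail"


@dataclass
class Flit:
    kind: str
    payload: Any = None
    dest: Optional[Tuple[int, int]] = None  # carried by the head flit only


def packetize(dest, payloads):
    """Split a packet into a head flit, body flits and one tail flit."""
    if not payloads:
        raise ValueError("a packet needs at least one payload word")
    flits = [Flit(HEAD, dest=dest)]
    flits += [Flit(BODY, payload=p) for p in payloads[:-1]]
    flits.append(Flit(TAIL, payload=payloads[-1]))
    return flits


def forward(flits, compute_route):
    """Yield (out_port, flit); the route is computed once, on the head flit."""
    route = None
    for flit in flits:
        if flit.kind == HEAD:
            route = compute_route(flit.dest)
        if route is None:
            raise ValueError("body or tail flit arrived without a head flit")
        yield route, flit
        if flit.kind == TAIL:
            route = None  # the tail flit releases the stored route


calls = []

def demo_route(dest):
    # Stand-in route function for the example; always selects "EAST".
    calls.append(dest)
    return "EAST"

sent = list(forward(packetize((2, 1), ["w0", "w1", "w2"]), demo_route))
```

In this example four flits are forwarded to the same port while `demo_route` is called only once, for the head flit.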

If VC allocation is successful, the third stage sends a request to the switch allocator to allocate the output port. Separable
input-first allocators have the advantage of lower communication delay, area overhead, and power consumption compared to other schemes.
Hence, the separable input-first allocator has been chosen for implementation in our low-latency NOC router. A separable input-first
allocator consists of two levels of arbitration.
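
The two arbitration levels of a separable input-first allocator can be sketched behaviorally as below. The round-robin arbiter is an assumption chosen for the illustration (the arbiter type is not specified here), and the names `RoundRobinArbiter` and `allocate` are hypothetical. In the first level each input port picks one of its requesting VCs; in the second level each output port picks one of the input ports that won the first level.

```python
class RoundRobinArbiter:
    """Round-robin arbiter (assumed for illustration)."""

    def __init__(self, n):
        self.n = n
        self.ptr = 0  # index that currently has the highest priority

    def grant(self, reqs):
        """Return the index of the granted request, or None if there is none."""
        for k in range(self.n):
            idx = (self.ptr + k) % self.n
            if reqs[idx]:
                return idx
        return None

    def update(self, idx):
        """Give the lowest priority to the index that was just served."""
        self.ptr = (idx + 1) % self.n


def allocate(requests, in_arbs, out_arbs):
    """requests[p][v] is the output port wanted by VC v of input p, or None.

    Returns a dict out_port -> (in_port, vc) of granted connections.
    """
    n_ports = len(requests)
    # Level 1: each input port selects one of its requesting VCs.
    winners = {}
    for p, vcs in enumerate(requests):
        v = in_arbs[p].grant([r is not None for r in vcs])
        if v is not None:
            winners[p] = v
    # Level 2: each output port selects one of the winning input ports.
    grants = {}
    for o, arb in enumerate(out_arbs):
        reqs = [p in winners and requests[p][winners[p]] == o
                for p in range(n_ports)]
        p = arb.grant(reqs)
        if p is not None:
            grants[o] = (p, winners[p])
            # Priorities advance only for requests that were fully granted.
            in_arbs[p].update(winners[p])
            arb.update(p)
    return grants


in_arbs = [RoundRobinArbiter(2) for _ in range(2)]
out_arbs = [RoundRobinArbiter(2) for _ in range(2)]
requests = [[1, 0], [1, None]]  # two input ports with two VCs each
first = allocate(requests, in_arbs, out_arbs)
second = allocate(requests, in_arbs, out_arbs)
```

In the first cycle both input ports put forward a request for output 1, so only one grant is issued and output 0 stays idle even though a request for it exists. This is a known matching limitation of separable allocators. In the second cycle the rotated priorities yield two grants.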

The routing algorithm determines the output port to which a packet must be sent to reach its destination. Deterministic routing performs
well with uniform traffic, where congestion is distributed equally across all links in an NOC. However, NOC traffic is bursty in
nature, which results in an imbalanced distribution of traffic across the links. Hence, deterministic routing results in poor
performance for such traffic. When a routing algorithm allows a packet to be sent to more than one port, a port selection module is
required to select the desired output port among them. In the case of a look-ahead deterministic routing algorithm, only a single output
port is selected and it can be used directly in our proposed design.
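
As an example of a deterministic algorithm, XY routing on a 2D mesh can be sketched as follows: a packet first travels along the X dimension until its column matches the destination, then along Y. The coordinate convention (x grows to the east, y grows to the north), the port names and the function names are assumptions made for this illustration.

```python
# Coordinate change for each port; assumes x grows east and y grows north.
STEP = {"EAST": (1, 0), "WEST": (-1, 0), "NORTH": (0, 1), "SOUTH": (0, -1)}


def xy_route(cur, dest):
    """Return the output port chosen at router `cur` for destination `dest`."""
    cx, cy = cur
    dx, dy = dest
    if dx > cx:
        return "EAST"
    if dx < cx:
        return "WEST"
    # X already matches, so route along Y.
    if dy > cy:
        return "NORTH"
    if dy < cy:
        return "SOUTH"
    return "LOCAL"  # arrived: deliver to the processing element


def xy_path(src, dest):
    """List the output ports taken hop by hop from src to dest."""
    hops = []
    cur = src
    while True:
        port = xy_route(cur, dest)
        hops.append(port)
        if port == "LOCAL":
            return hops
        sx, sy = STEP[port]
        cur = (cur[0] + sx, cur[1] + sy)


path = xy_path((0, 0), (2, 1))
```

For this source and destination, `path` is `["EAST", "EAST", "NORTH", "LOCAL"]`. Because exactly one port is returned at every hop, no port selection module is needed.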

IV. Results 


In this paper, the data can easily reach the destination by using the routers. The routers guide the data to the required
output ports. The switch has four input and output ports. The inputs are given in four directions: north, south, east and west. The
outputs are obtained in the same way.



Fig 2. Input Request



In Fig 2, the input channel is requested through the router and waits for the acknowledgement from the output side. The data is
given in four directions.


Fig 3. Input Acknowledgement 


The input channel acknowledgement is shown in Figure 3. 



Fig 4. Output channel request


The output channel request is shown in Figure 4. 

Fig 5. Output acknowledgement

The output channel acknowledgement is shown in Figure 5.


Table I. 



The number of slices, flip flops and I/O ports that are used is shown in Table I.

V. Conclusion 

In this work, a router architecture was proposed for a network-on-chip (NOC), a new paradigm in complex system-on-chip (SOC) designs that
provides efficient on-chip communication networks. An NOC allows scalable communication and allows decoupling of communication and
computation. In this project we designed, implemented and analyzed crossbar router architectures for network-on-chip communication on an
FPGA. Our proposed architecture is optimized in five main criteria compared to existing works: the 4x4 crossbar switch, switch allocator,
path and channel request, data RAM and 4 I/O ports.